15 research outputs found

    Learning Tractable Probabilistic Models for Fault Localization

    In recent years, several probabilistic techniques have been applied to various debugging problems. However, most existing probabilistic debugging systems use relatively simple statistical models and fail to generalize across multiple programs. In this work, we propose Tractable Fault Localization Models (TFLMs), which can be learned from data and probabilistically infer the location of the bug. While most previous statistical debugging methods generalize over many executions of a single program, TFLMs are trained on a corpus of previously seen buggy programs and learn to identify recurring patterns of bugs. Widely used fault localization techniques such as TARANTULA evaluate the suspiciousness of each line in isolation; in contrast, a TFLM defines a joint probability distribution over buggy indicator variables for each line. Joint distributions with rich dependency structure are often computationally intractable; TFLMs avoid this by exploiting recent developments in tractable probabilistic models (specifically, Relational SPNs). Further, TFLMs can incorporate additional sources of information, including coverage-based features such as TARANTULA scores. We evaluate the fault localization performance of TFLMs that include TARANTULA scores as features in the probabilistic model. Our study shows that the learned TFLMs isolate bugs more effectively than previous statistical methods or TARANTULA used directly.
    Comment: Fifth International Workshop on Statistical Relational AI (StaR-AI 2015).
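
    For concreteness, the TARANTULA score that TFLMs consume as a feature is the standard coverage-based suspiciousness metric: for each line, the fraction of failing tests that execute it, normalized against the corresponding fraction for passing tests. A minimal sketch in Python; the function and variable names are illustrative, not from the paper:

        def tarantula_suspiciousness(passed, failed, total_passed, total_failed):
            # passed/failed: number of passing/failing tests that execute this line;
            # total_passed/total_failed: totals over the whole test suite.
            fail_ratio = failed / total_failed if total_failed else 0.0
            pass_ratio = passed / total_passed if total_passed else 0.0
            if fail_ratio + pass_ratio == 0:
                return 0.0  # line never executed by any test
            return fail_ratio / (fail_ratio + pass_ratio)

        # A line covered by 9 of 10 failing tests but only 2 of 40 passing ones:
        print(tarantula_suspiciousness(2, 9, 40, 10))  # 0.9 / (0.9 + 0.05) ≈ 0.947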

    Learning and Exploiting Relational Structure for Efficient Inference

    Thesis (Ph.D.), University of Washington, 2015.
    One of the central challenges of statistical relational learning is the tradeoff between expressiveness and computational tractability. Representations such as Markov logic can capture rich joint probabilistic models over a set of related objects. However, inference in Markov logic and similar languages is #P-complete. Most existing tractable statistical relational representations are very limited in expressiveness. This dissertation explores two strategies for dealing with intractability while preserving expressiveness. The first strategy is to exploit the approximate symmetries frequently found in relational domains to perform approximate lifted inference. We provide error bounds for two approaches for approximate lifted belief propagation. We also describe propositional and lifted inference algorithms for repeated inference in statistical relational models. We describe a general approach for expected utility maximization in relational domains, making use of these algorithms. The second strategy we explore is learning rich relational representations directly from data. First, we propose a method for learning multiple hierarchical relational clusterings, unifying several previous approaches to relational clustering. Second, we describe a tractable high-treewidth statistical relational representation based on Sum-Product Networks, and propose a learning algorithm for this language. Finally, we apply state-of-the-art tractable learning methods to the problem of software fault localization.

    Learning tractable statistical relational models

    Abstract: Sum-product networks (SPNs) …
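
    For context, the tractability these papers rely on comes from how an SPN answers queries: a single bottom-up pass over the network, linear in its size, with unobserved variables marginalized out at the leaves. A minimal sketch over two binary variables; the node classes and the toy network are illustrative assumptions, not from the paper:

        class Leaf:
            def __init__(self, var, probs):          # probs[v] = P(var = v)
                self.var, self.probs = var, probs
            def eval(self, assignment):
                v = assignment.get(self.var)
                return 1.0 if v is None else self.probs[v]  # marginalize if unobserved

        class Product:
            def __init__(self, children):
                self.children = children
            def eval(self, assignment):
                result = 1.0
                for child in self.children:
                    result *= child.eval(assignment)
                return result

        class Sum:
            def __init__(self, weighted_children):   # [(weight, child), ...]
                self.weighted_children = weighted_children
            def eval(self, assignment):
                return sum(w * c.eval(assignment) for w, c in self.weighted_children)

        # P(X0 = 1, X1 = 0) under a toy two-component mixture:
        spn = Sum([(0.6, Product([Leaf(0, [0.2, 0.8]), Leaf(1, [0.7, 0.3])])),
                   (0.4, Product([Leaf(0, [0.9, 0.1]), Leaf(1, [0.4, 0.6])]))])
        print(spn.eval({0: 1, 1: 0}))                # 0.6*0.8*0.7 + 0.4*0.1*0.4 = 0.352

    Leaving a variable out of the assignment, e.g. spn.eval({0: 1}), sums it out in the same single pass, which is what makes marginals and conditionals tractable.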

    Efficient Belief Propagation for Utility Maximization and Repeated Inference

    Many problems require repeated inference on probabilistic graphical models, with different values for evidence variables or other changes. Examples of such problems include utility maximization, MAP inference, online and interactive inference, parameter and structure learning, and dynamic inference. Since small changes to the evidence typically affect only a small region of the network, repeatedly performing inference from scratch can be massively redundant. In this paper, we propose expanding frontier belief propagation (EFBP), an efficient approximate algorithm for probabilistic inference with incremental changes to the evidence (or model). EFBP is an extension of loopy belief propagation (BP) in which each run of inference reuses results from the previous ones, instead of starting from scratch with the new evidence; messages are only propagated in regions of the network affected by the changes. We provide theoretical guarantees bounding the difference in beliefs generated by EFBP and standard BP, and apply EFBP to the problem of expected utility maximization in influence diagrams. Experiments on viral marketing and combinatorial auction problems show that EFBP can converge much faster than BP without significantly affecting the quality of the solutions.
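
    A minimal sketch of the expanding-frontier idea on a pairwise binary Markov network, assuming messages from the previous run are kept around and evidence changes enter through the unary potentials; the data layout and names are illustrative, not the paper's implementation:

        from collections import deque

        def efbp_update(edges, unary, pairwise, messages, changed, tol=1e-6):
            # edges[i]: neighbors of i; unary[i]: length-2 potential (with evidence
            # absorbed); pairwise[(i, j)][xi][xj]: 2x2 potential, stored for both
            # orderings; messages[(i, j)]: message from i to j, reused from the
            # previous run outside the affected region; changed: nodes whose
            # evidence changed.
            frontier = deque((i, j) for i in changed for j in edges[i])
            while frontier:
                i, j = frontier.popleft()
                new = [0.0, 0.0]
                for xj in (0, 1):
                    for xi in (0, 1):
                        incoming = 1.0
                        for k in edges[i]:
                            if k != j:
                                incoming *= messages[(k, i)][xi]
                        new[xj] += unary[i][xi] * pairwise[(i, j)][xi][xj] * incoming
                z = new[0] + new[1]
                new = [v / z for v in new]
                # expand the frontier only where the message actually moved
                if max(abs(a - b) for a, b in zip(new, messages[(i, j)])) > tol:
                    messages[(i, j)] = new
                    frontier.extend((j, k) for k in edges[j] if k != i)
            return messages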

    Efficient Lifting for Online Probabilistic Inference

    Lifting can greatly reduce the cost of inference on first-order probabilistic graphical models, but constructing the lifted network can itself be quite costly. In online applications (e.g., video segmentation), repeatedly constructing the lifted network for each new inference can be extremely wasteful, because the evidence typically changes little from one inference to the next. The same is true in many other problems that require repeated inference, such as utility maximization, MAP inference, interactive inference, and parameter and structure learning. In this paper, we propose an efficient algorithm for updating the structure of an existing lifted network with incremental changes to the evidence. This allows us to construct the lifted network once for the initial inference.
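
    A simplified sketch of the incremental idea: ground variables that are indistinguishable share a supernode, and when evidence changes for a few variables, only those variables move between supernodes instead of the whole network being re-lifted. Grouping purely by evidence value is an assumption made here for brevity; a full construction would also refine groups by neighborhood structure (e.g., via color passing):

        def initial_lifting(evidence):
            # Group variables into supernodes keyed by their evidence value.
            supernodes = {}
            for var, value in evidence.items():
                supernodes.setdefault(value, set()).add(var)
            return supernodes

        def incremental_update(supernodes, old_evidence, new_evidence):
            # Move only the variables whose evidence actually changed.
            for var, value in new_evidence.items():
                old = old_evidence.get(var)
                if old != value:
                    if old in supernodes:
                        supernodes[old].discard(var)
                    supernodes.setdefault(value, set()).add(var)
            return supernodes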

    Learning Relational Sum-Product Networks

    Sum-product networks (SPNs) are a recently proposed deep architecture that guarantees tractable inference, even on certain high-treewidth models. SPNs are a propositional architecture, treating the instances as independent and identically distributed. In this paper, we introduce Relational Sum-Product Networks (RSPNs), a new tractable first-order probabilistic architecture. RSPNs generalize SPNs by modeling a set of instances jointly, allowing them to influence each other's probability distributions, as well as modeling probabilities of relations between objects. We also present LearnRSPN, the first algorithm for learning high-treewidth tractable statistical relational models. LearnRSPN is a recursive top-down structure learning algorithm for RSPNs, based on Gens and Domingos' LearnSPN algorithm for propositional SPN learning. We evaluate the algorithm on three datasets; the RSPN learning algorithm outperforms Markov Logic Networks in both running time and predictive accuracy.
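
    Since LearnRSPN builds on Gens and Domingos' LearnSPN, the shape of that recursion is worth sketching: split the variables into approximately independent groups where possible (a product node), otherwise cluster the instances (a sum node), and recurse. The helper functions below stand in for an independence test and a clustering step; they, and the tuple-based node encoding, are assumptions, not the paper's exact procedures:

        def learn_spn(data, variables, independent_groups, cluster_rows, fit_leaf):
            if len(variables) == 1:
                return fit_leaf(data, variables[0])       # univariate leaf
            groups = independent_groups(data, variables)
            if len(groups) > 1:                           # decomposable: product node
                return ('product', [learn_spn(data, g, independent_groups,
                                              cluster_rows, fit_leaf)
                                    for g in groups])
            clusters = cluster_rows(data)                 # otherwise: sum node
            if len(clusters) == 1:                        # guard: fall back to a
                return ('product',                        # fully factorized model
                        [fit_leaf(data, v) for v in variables])
            n = sum(len(c) for c in clusters)
            return ('sum', [(len(c) / n,
                             learn_spn(c, variables, independent_groups,
                                       cluster_rows, fit_leaf))
                            for c in clusters])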